Diagnostics for Canonical Correlation

Author

  • Denny Meyer
Abstract

Canonical correlation analysis is a versatile multivariate technique that is prone to distortion as a result of correlation outliers. The detection and treatment of such outliers is complicated by outlier masking effects. Methods that check the effect of one observation at a time are therefore unsuccessful as diagnostic tools. In this paper we suggest that an approach involving the robust estimation of correlation matrices be used for canonical correlation analysis, with the robustness weights used to identify outliers. We apply this approach first to the full correlation matrix before performing a canonical correlation analysis, and then to the canonical variate scores after performing an initial canonical correlation analysis. Real and simulated examples suggest that, provided an appropriate weighting system is chosen, the latter approach performs best in terms of detecting canonical correlation outliers and containing them.

Introduction

In canonical correlation analysis, components are extracted from two sets of variables in such a way as to maximise the correlation between these components. When one of the variable sets consists of indicator variables, canonical correlation analysis is equivalent to a discriminant analysis. When both sets of variables consist of indicator variables, canonical correlation analysis is equivalent to a correspondence analysis.

Canonical correlation has also been used to develop structured time series models (Harvey, 1989). In this case the two sets of variables are identical except that the predictor set is lagged by one period. More recently Johansen (1996) has used canonical correlation to test for cointegration between time series. In econometric and financial data it is often important to determine whether variables have a long-term relationship. Such variables are said to be cointegrated and the relationship between them may be described using an error correction model.
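To make the definition of canonical correlation concrete, and to illustrate the sensitivity to outliers noted in the abstract, the following is a minimal sketch rather than the paper's method. It assumes exactly two variables in each set, uses ordinary (non-robust) sample covariances, and takes the first canonical correlation as the square root of the largest eigenvalue of Sxx^-1 Sxy Syy^-1 Syx. The function names, the simulated data and the outlier value of 50 are all illustrative assumptions.

```python
import math
import random

def covariance(u, v):
    """Sample covariance of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)

def cov_block(cols_a, cols_b):
    """Covariance matrix between two lists of variable columns."""
    return [[covariance(a, b) for b in cols_b] for a in cols_a]

def matmul(A, B):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def inv2(A):
    """Inverse of a 2x2 matrix (assumes it is non-singular)."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def first_canonical_correlation(x_cols, y_cols):
    """First canonical correlation for two sets of two variables each."""
    Sxx, Syy = cov_block(x_cols, x_cols), cov_block(y_cols, y_cols)
    Sxy, Syx = cov_block(x_cols, y_cols), cov_block(y_cols, x_cols)
    M = matmul(matmul(inv2(Sxx), Sxy), matmul(inv2(Syy), Syx))
    # Eigenvalues of the 2x2 matrix M are the squared canonical correlations.
    trace = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    largest = (trace + math.sqrt(max(trace * trace - 4 * det, 0.0))) / 2
    return math.sqrt(largest)

# Illustrative data (an assumption): four mutually independent variables,
# so any sizeable canonical correlation is spurious.
random.seed(0)
n = 200
x1, x2, y1, y2 = ([random.gauss(0, 1) for _ in range(n)] for _ in range(4))

rho_clean = first_canonical_correlation([x1, x2], [y1, y2])

# Append a single gross outlier that is extreme in both x1 and y1.
rho_contaminated = first_canonical_correlation(
    [x1 + [50.0], x2 + [0.0]], [y1 + [50.0], y2 + [0.0]]
)

print("clean:", round(rho_clean, 3), "with one outlier:", round(rho_contaminated, 3))
```

In this sketch a single extreme observation is enough to push the first canonical correlation of unrelated variables close to one, which is the kind of distortion that motivates a robust estimate of the correlation matrix.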
Consider a number of time series measured at regular intervals of time (e.g. monthly). Most econometric variables wander without settling around a fixed level and are therefore said to display non-stationary behaviour. When variables are cointegrated it is possible to find linear combinations of these variables that are stationary. When there are only two time series it is easy to test for cointegration: the output series is regressed on the input series and the residuals are tested for stationarity. If the residuals are stationary, the two series are cointegrated and the relationship between them is expressed using an error correction model. In its simplest form this model is given by the equation

$$\Delta a_t = \gamma_0 \Delta b_t + \gamma_1 (a_{t-1} - \beta b_{t-1}) + e_t$$

In this equation $a_t$ denotes the output series, $b_t$ denotes the input series and the series $e_t$ is a white noise process of independent errors. The symbol $\Delta$ is the difference operator used to indicate the change in a variable from one period to the next. The vector $(1, -\beta)$ is referred to as the cointegrating vector because it produces a stationary process despite the fact that $a_t$ and $b_t$ are non-stationary.

When there are more than two non-stationary time series it is not so easy to test for cointegration and one needs to use canonical correlation analysis. This approach is also reputed to give more reliable results for the two-series case. However, to the best of the author's knowledge the effect of outliers in this comparison has not been investigated. A brief description of canonical correlation analysis follows. If the two variable sets are labelled x and y the solution to a canonical correlation analysis can be expressed as the solution to the following equations

80 R.L.I.M.S. Vol. 4, May 2003
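The two-series procedure described earlier in this section (regress the output series on the input series, examine the residuals, then fit the error correction model) can be sketched as follows. This is only an illustration under stated assumptions: the data are simulated, the values beta = 2 and an AR(1) coefficient of 0.5 are invented for the example, and the residual check is an informal look at the mean-reversion slope rather than a formal stationarity test, which would require Dickey-Fuller type critical values.

```python
import random

def ols_line(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (slope, intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def ols_two_regressors(z1, z2, y):
    """Least squares for y = g0*z1 + g1*z2 + error (no intercept)."""
    s11 = sum(p * p for p in z1)
    s22 = sum(q * q for q in z2)
    s12 = sum(p * q for p, q in zip(z1, z2))
    s1y = sum(p * r for p, r in zip(z1, y))
    s2y = sum(q * r for q, r in zip(z2, y))
    det = s11 * s22 - s12 * s12
    return (s1y * s22 - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

# Simulated data (assumption): b is a random walk, a = beta*b + stationary AR(1) noise.
random.seed(1)
n = 500
beta_true = 2.0
b, u = [0.0], [0.0]
for _ in range(n - 1):
    b.append(b[-1] + random.gauss(0, 1))        # non-stationary input series
    u.append(0.5 * u[-1] + random.gauss(0, 1))  # stationary deviation
a = [beta_true * bt + ut for bt, ut in zip(b, u)]

# Step 1: regress the output series on the input series.
beta_hat, alpha_hat = ols_line(b, a)
resid = [at - alpha_hat - beta_hat * bt for at, bt in zip(a, b)]

# Step 2 (informal): slope of the change in the residual on its lagged level.
# A clearly negative slope suggests the residuals revert to their mean.
d_resid = [resid[t] - resid[t - 1] for t in range(1, n)]
lag_resid = resid[:-1]
rho = sum(d * l for d, l in zip(d_resid, lag_resid)) / sum(l * l for l in lag_resid)

# Step 3: error correction model
#   delta a_t = gamma0 * delta b_t + gamma1 * (a_{t-1} - beta * b_{t-1}) + e_t,
# using the lagged step-1 residual as the error correction term.
da = [a[t] - a[t - 1] for t in range(1, n)]
db = [b[t] - b[t - 1] for t in range(1, n)]
gamma0, gamma1 = ols_two_regressors(db, lag_resid, da)

print("beta_hat:", round(beta_hat, 3), "rho:", round(rho, 3))
print("gamma0:", round(gamma0, 3), "gamma1:", round(gamma1, 3))
```

With these simulation settings the estimates should land near the simulated values (beta and gamma0 near 2, gamma1 near -0.5), but the sketch says nothing about how outliers affect the comparison, which is the question the paper raises.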


Similar resources

Canonical Correlation Analysis for Determination of Relationship between Morphological and Physiological Pollinated Characteristics in Five Varieties of Phalaenopsis

Phalaenopsis is an important genus of orchids that is grown for economical production of cut flowers and potted plants. The objective of this study is to evaluate the correlation between morphological and physiological traits under self- and cross-pollination of 5 varieties of Phalaenopsis orchid. Some morphological traits were measured: Capsule length (CL), capsule volume (CV), weight of seeds in...

Canonical Analysis of the Relationship between Components of Professional Ethics and Dimensions of Social Responsibility

Background: Today, professional ethics and social responsibility play an important role in organizations. This study aimed to carry out a canonical analysis of the relationship between components of professional ethics and dimensions of social responsibility among first high school teachers in Naghadeh province. Method: In terms of purpose this study is applied, and in terms of data collec...

Generalization of Canonical Correlation Analysis from Multivariate to Functional Cases and its related problems

In multivariate cases, the aim of canonical correlation analysis (CCA) for two sets of variables x and y is to obtain linear combinations of them that have the largest possible correlation. However, when x and y are by nature continuous functions of another variable (generally time), these two functions belong to function spaces which are of infinite dimension, and CCA for them should b...

Analysis of the Relationships between Slope Percentage and Position and Barley Yield Components Using the Canonical Correlation Technique

The present study was conducted in Mollaahmad watershed of Ardabil city in 1390 to evaluate the canonical correlation between tillage erosion [slope gradient and position], and grain yield and barley yield components (Sahand cultivar) in order to determine the covariability between the two sets of variables to (1) detect simultaneously occurring patterns in the interdependencies sets of canonic...

The canonical correlation between Contingencies self-worth and adjustment of students

This research aimed at studying the canonical correlation between contingencies of self-worth (Family support, Competition, Appearance, God’s love, Academic competence, Virtue, Approval from others) and adjustment (emotional, social and academic). For this research, 221 university students were selected by random ratio sampling method (1272 cases). Data was gathered through contingencies ...

Publication date: 2003